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Abstract. We propose a novel numerical inversion algorithm for parabolic partial differential equations, based 
on model reduction. The study is motivated by the application of controlled source electromagnetic exploration, 
where the unknown is the subsurface electrical resistivity and the data are time resolved surface measurements of the 
magnetic field. The algorithm presented in this paper considers a layered medium. The reduced model is obtained 
with rational interpolation in the frequency (Laplace) domain and a rational Krylov subspace projection method. It 
amounts to a nonlinear mapping from the function space of the unknown resistivity to the small dimensional space 
of the parameters of the reduced model. We use this mapping as a nonlinear preconditioner for the Gauss-Newton 
iterative solution of the inverse problem. The advantage of the inversion algorithm is twofold. First, the nonlinear 
preconditioner resolves most of the nonlinearity of the problem. Thus the iterations are less likely to get stuck in local 
minima and the convergence is fast. Second, the inversion is computationally efficient because it avoids repeated 
computations of the time domain solutions of the forward problem. We study the stability of the inversion algorithm 
for various rational Krylov subspaces, and assess its performance with numerical experiments. 
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1. Introduction. Inverse problems for parabolic partial differential equations arise in 
applications such as groundwater flow, solute transport and controlled source electromagnetic 
oil and gas exploration. We consider the latter problem, where the unknown r(x) is the 
electrical resistivity, the coefficient in the diffusion Maxwell system satisfied by the magnetic 



for time t > and x in some domain f2. The data are time resolved measurements of H(i, x) 
at receivers located on the boundary of Q. 

Inversion is challenging specially because the problem is ill-posed and thus sensitive to 
noise. A typical numerical approach is to minimize a functional given by the least squares 
data misfit and a regularization term, using Gauss-Newton or nonlinear conjugate gradient 
methods [22, 25, 23]. There are two main drawbacks. First, the functional to be minimized is 
not convex and the optimization algorithms can get stuck in local minima. The lack of con- 
vexity can be overcome to some extent by adding more regularization at the cost of artifacts 
in the solution. Nevertheless, convergence may be very slow [23]. Second, evaluations of the 
functional and its derivatives are computationally expensive, because they involve multiple 
numerical solutions of the forward problem. In applications the computational domains may 
be large with meshes refined near sources, receivers and regions of strong heterogeneity. This 
results in a large number of unknowns in the forward problem, and time stepping with such 
large systems is expensive over long time intervals. 
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We propose a numerical inversion approach that arises when considering the inverse 
problem in the model reduction framework. In this paper we limit the study to the one di- 
mensional parabolic equation 



(1.2) f 

ox 



. .&u(t,x) 
r(x)- 



dx 



du(t, x) 
dt 



for x in the interval [0, 1]. It can be derived from (1.1) in the case of a layered medium in a 
slab Q = [0, 1] x R 2 , and for a magnetic field 

(1.3) H(i, x) = u(t, x)e z , x = xe x + ye v + ze z . 
The function u(t, x) satisfies the boundary conditions 

(1.4) *f° ) =.(M)=0, t>0, 
and the initial condition 

(1.5) u(0,x) = 8(x), 

corresponding to a point source at x = 0. 

The reduced model is obtained with a rational Krylov subspace projection method. These 
methods have been applied to forward problems for parabolic equations in [7, 10, 11, 14]. 
They have been introduced recently for inversion in [12, 20], where the reduced models are 
obtained using rational interpolation of the transfer function in the Laplace domain. Intro- 
duction of the reduced order models allows to replace the solution of the full scale parabolic 
problem by its low order projection, thus resolving one of the inversion challenges mentioned 
above, namely the one of high computational cost. 

The transfer function is the Laplace transform of the time domain response 

(1.6) y(t)=u(t,0), t>0, 

and the reduced model is its rational approximant. Conventionally, it is parametrized in terms 
of its poles and residues. Instead we suggest to use the parametrization in terms of the coeffi- 
cients of the continued fraction expansion of the rational approximant. This rational approx- 
imant itself can be viewed as the transfer function of a finite difference discretization of (1.2) 
on a special rather coarse grid, known in the literature as the optimal or spectrally matched 
grid, or as a finite-difference Gaussian rule [8]. The continued fraction coefficients are the 
entries in the finite difference operator, the discrete resistivities. Thus we define the nonlinear 
mapping Q from the function space of the unknown r(x) to the small dimensional space of 
the the discrete resistivities. That allows us to mitigate the other inversion challenge men- 
tioned above, as follows. Mapping Q takes the function r(x) to the discrete resistivities and 
it appears to resolve most of the nonlinearity of the problem. We use it as a preconditioner 
of the Gauss-Newton iteration, which is less likely to get stuck in local minima and con- 
verges quickly, even when the initial guess is far from the true r(x). This fact, combined with 
the small number of Gauss-Newton iterations leads to a computationally efficient inversion 
algorithm. 

The paper is organized as follows: We begin in section 2 with the formulation of the 
problem. The projection based model reduction and its use in inversion are described in 
section 3. We compare the performance of various types of reduced models for inversion in 
section 4. The details of the computation of the nonlinear mapping Q and its Jacobian are 
given in section 5. The outline of the inversion algorithm and the numerical results are in 
section 6. We conclude with a summary in section 7. 
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2. Problem formulation. To adhere to the conventional setting of model reduction, we 
consider the semidiscretized equation (1.2), written as 

(2.D ^ 

for the vector u(t) £ M. N of discrete values of u(t, x) on a fine uniform grid with N points in 
the interval [0, 1]. Here 

(2.2) A(r) = -L> T diag(r)L> 

is a symmetric and negative definite operator, the discretization of d x [r(x)d x ]. The vector 
r contains the discrete values of r(x) and D is the finite difference discretization of the 
derivative in x, for the boundary conditions (1.4). The initial condition is 

(2.3) u(0) = iei, 

the approximation of (1.5) on the grid with spacing h = l/(iV + 1), and the time-domain 
response of the discrete system is given by 

(2.4) y(t;r)=efu(t). 

It is the analogue of (1.6), and we emphasize in the notation that it depends on the vector r of 
discrete resistivities. 

The measured time domain data is denoted by 

(2.5) d(t) =y(t;r™) + £(*), t > 0, 

where £(t) is due to noise and the discretization errors. The vector r ImL in (2.5) is to be 
recovered from the given d(t). We estimate it by r*, the solution of the nonlinear optimization 
problem 



(2.6) r* = arg min - || Q(d(t)) - Q(y(t; r))\\ 2 2 + -V{v), 



over the admissible set of resistivities S C Mil. The optimization is regularized by adding a 
functional V(r) with a parameter a. For example, V(r) may be the total variation norm of r, 
or the square of the £2 norm of r. The map 

(2.7) Q : C(0, +00) -> R q 

is a nonlinear preconditioner that maps the time domain response functions to the space of 
dimension q <C N of the parameters of the reduced model. 

It is typical in inversion not to have a preconditioner, and therefore minimize the L2 
norm of the (weighted) misfit between the measured data d(t) and the model y(t;r). The 
main contribution of this paper is to introduce the nonlinear preconditioner Q so that the 
optimization problem (2.6) becomes well posed even in the absence of the regularization 
term V(r). All the instabilities are in the computation of Q(d(t)). We stabilize it by limiting 
the dimension q of the reduced model, depending on the level of noise. The resolution of the 
estimated resistivity depends on q, so there is a trade-off between the stability and resolution 
of the reconstruction. Note that since q <C N, the Jacobian T)Q of Q has a large null space 
and the vector of residuals is in the column space of T>Q. If we have prior information about 
r" m , we can use it to improve the reconstruction r* by adding corrections in the null space of 
T)Q that minimize the regularization term V(r) without affecting the residual. 
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3. Model order reduction and inversion. In this section we show how we can use 

the model order reduction framework to obtain the nonlinear preconditioner Q in (2.6). For 
this purpose we treat (2.1)-(2.4) as a dynamical system. We symmetrize the system for 
convenience by taking equal initial value and measurement vectors defined by 

an b = J=e, 

The transfer function of the dynamical system is the Laplace transform of the time domain 
response 

r+co 

(3.2) G(s;r)= y(t; r)e~ st dt = b T (sI - A(r)) _1 b, s > 0. 

Jo 

Note that since A(r) is negative definite, all poles of the transfer function are negative and 
the dynamical system is stable. 

In model order reduction we obtain a reduced model (A m , b m ) so that its transfer func- 
tion 

(3.3) G m (s) = hj n {sl m ~ A ro )" 1 b m 

is a good approximation of G(s; r) as a function of s in some norm. Here I m is the m x m 
identity matrix, A m e M mxm , b m £ M m and m <C N. The dimension q = q(m) of our 
reduced model is defined later in terms of m. Note that while the matrix A(r) given by (2.2) 
is sparse, the reduced order matrix A m is typically dense. Thus, it has no straightforward 
interpretation as some coarse discretization of the differential operator d x [r(x)d x ]. 

3.1. Projection-based model reduction. Projection-based methods search for reduced 
models of the form 



(3.4) A m = V T AV, b m = V T b, V T V — I m , 



where the columns of V £ K Arxm form an orthonormal basis of an m-dimensional subspace 
of R N on which the system is projected. 

The choice of V depends on the matching conditions for G m and G. They prescribe the 
sense in which G m approximates G. Here we consider moment matching at interpolation 
nodes (Xj € [0, +oo) that may be distinct or coinciding, 



d k G 

(3.5) 



ds k 



d k G 



ds k 



k = 0,...,2M j -l, j = l,. ..,1. 



The multiplicity of a node (Xj is denoted by Mj, so at non-coinciding nodes Mj — 1. The 
reduced order transfer function G m matches G and its derivatives up to the order 2Mj — 1, 
and the size m of the reduced model is given by 

I 

Note from (3.3) that G m (s) is a rational function of s, so (3.5) is a rational interpolation 
problem. The function G m (s) admits the partial fraction representation 

m 

(3.6) G ™( S ) = E^V' C J >0 ' ^ >0 - 
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Its poles — 9j are the eigenvalues of A m and the residues Cj are defined in terms of the 
normalized eigenvectors Zj , 

(3.7) cj = (b^Zj) 2 , A m Zj + 6jZj=0, ||zj-|| = 1, j = l,...,m. 

For the reduced models (3.4), the conditions (3.5) are known to be equivalent to the columns 
of V being an orthogonal basis of the rational Krylov subspace 

(3.8) fC m (a) = span {{a 3 I - A)~ k h \ j = 1, . . . ,/; k = 1, . . . , Mj} . 

The interpolation nodes, obviously, should be chosen in the resolvent set of A(r). More- 
over, since in reality we need to solve a limiting continuum problem, the nodes should be in 
the closure of the intersection of the resolvent sets of any sequence of the finite-difference 
operators convergent to the continuum problem. This set is C \ (0, oo] for the continuum 
problems on bounded domains and C \ [0, oo] for the unbounded ones. 

Ideally, in the solution of the inverse problem we would like to minimize the time (or 
frequency) domain data misfit in a quadratic norm weighted in accordance with the statisti- 
cal distribution of the measurement error. When considering the reduced order model, it is 
natural to choose the interpolation nodes that give the most accurate approximation in that 
norm. Such interpolation is known in control theory as H2 (Hardy space) optimal, and in 
many cases the optimal interpolation nodes can be found numerically [18]. Moreover, it was 
shown in [12], that the solution of the inverse problem using reduced order models with such 
interpolation nodes also minimizes the misfit functional (in the absence of measurement er- 
rors). When such nodes are not available, we can select some reasonable interpolation nodes 
chosen based on a priori error estimates, for example, the so-called Zolotarev points or their 
approximations obtained with the help of potential theory [10, 24]. In most cases such choices 
lead to interpolation nodes distributed geometrically. 

3.2. Inversion via model reduction. Inversion via model order reduction was first in- 
troduced in [12, 13, 20]. In particular, the approach of [12] produces the preconditioner 
Q pr {y( ' i r )) given by the following chain of mappings 

(3.9) Q P Av(-;r)): r 4 A(r) 4 V 4 A m 4 {( Cj , ^)}f =1 , 

where (a) is given by (2.2), (b) involves orthogonalization of a basis for /C m (er), (c) is simply 
(3.4) and (d) is obtained from (3.7). Thus, Q pr (y( ■ ; r)) maps r G to a vector of residues 
and poles (ci, . . . , c m , 61, ... , 6 m ) T £ E 1 ^ with q = 2m. Since the poles and residues do 
not have a physical meaning of resistivity, Q pr cannot be an approximation of identity, as we 
would expect a good nonlinear preconditioner to be. 

Here we propose to modify (3.9) using ideas from [4, 17]. We add another mapping to 
the chain so that the resulting operator Q(y( ■ ; r)) approximates the identity. For this purpose 
observe that the positive residues and poles {(cj, 9j)} 1 J l =1 define a rational function (3.6) 
which admits a representation as a Stieltjes continued fraction with positive coefficients 

(3.10) G m {s) = - — x . 

KlS H j 

Kl H j 

%s H 
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The function (3.10) is known [8] to be a boundary response function w\ (s) of a second order 
finite difference scheme with a three point stencil 

1 ( Wj + i — Wj Wj — Wj-i 



(3.11) =H— --— J —)-8W j = 0, j = 2,...,m, 

Kj \ Kj Kj-1 J 

and boundary conditions 

c\ 1 1\ I f w 2 - Wl \ 1 

(3.12) — - swi + — = 0, w m+1 = 0. 

These equations resemble the Laplace transform of (2.1), but here the discretization is at 
much fewer nodes than N. The coefficients Kj and Kj have the meaning of resistivity and 
inverse resistivity, respectively. 

The chain of mappings for computing the nonlinear preconditioner is 

(3.13) Q(y( ■ ; r)) : r 4 A(r) 4 V 4 A m 4 {(c^)}^! 4 %)}f =1 , 
and the dimension q of the space of parameters of the reduced model in (2.7) is given by 

(3.14) q = 2m. 

The mapping (e) can be computed with a variant of the Lanczos iteration, as explained in 
section 5. Alternatively, the composition of (d) and (e) taking A m to the continued fraction 
coefficients can be computed directly via a Lanczos iteration. 

The computation of Q{y{ - ;r)) for a trial r via (3.13) is a stable procedure. How- 
ever, the objective function in (2.6) also contains the term Q(d(t)) whose computation is 
ill-conditioned. This instability is unavoidable, it comes from the ill-posedness of the inverse 
problem. We mitigate it by reducing the dimension q of the reduced model depending on the 
noise level. In practice we need to compute a Pade approximant of the Laplace transform of 
d(t) in order to obtain Q(d(t)). The approximant must use the same matching conditions as 
the model reduction scheme used to compute Q(y( ■ ; r)). The details of the computation of 
Q(d(t)) are given in section 4.3. 

4. Matching conditions for inversion. Obviously, a good nonlinear preconditioner Q 
should be an approximation of the identity on some space. Moreover, to have a good re- 
construction of r(x), the continued fraction coefficients must correspond to the values of 
the resistivity at spread out locations in the domain [0, 1]. In this section we study how this 
property depends on the matching conditions (3.5). 

A good measure of how close Q is to the identity is the condition number of the Jacobian 
of Q with respect to r, denoted hereafter by VQ £ E 2mxJv . It is defined by 

8 k j 

— — , if 1 < j < m 
ar k 

(VQ) jik = \ ^ , k = l,...,N. 

, if m + 1 < j < 2m 



The computation ofDQ requires differentiation of all the mappings in (3.13), and follows 
from the chain rule. We describe it in detail in section 5. 
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4.1. Connection to optimal grids and sensitivity functions. Our inversion method is 
based on ideas developed for the inverse spectral problem in [4] and for Electrical Impedance 
Tomography in [3, 5, 6, 21]. They use the concept of optimal grids that was originally in- 
troduced in [2, 8, 9, 19] to obtain exponential superconvergence of approximations of the 
Dirichlet-to-Neumann and Neumann-to-Dirichlet maps. While the method proposed here is 
not directly based on the use of optimal grids, it is beneficial to study them nevertheless, since 
they provide insight into the behavior of the nonlinear preconditioner (3.13). Good choices 
of matching conditions for step (b) of (3.13) give good distributions of the grid nodes, and 
therefore good reconstructions of r(x) throughout the domain. 

The optimal grids are usually defined by the coefficients of the continued fraction corre- 
sponding to a reference constant resistivity A°\ In principle we can define grids for another 
reference resistivity, not necessarily constant. But the results in [3, 4, 5, 6, 21] show that the 
grids change very little with respect to the reference A°\ This property makes them very 
useful in inversion. 

Let then A°\x) = 1 and note from (3.11)-(3.12) that 

( K f)^f)™ 1 = Q( y (.,r(°))), r(°) = (l,l > ...,lf J 
are grid steps in a finite difference scheme for the equation 

d 2 w 



sw = 0. 
ox z 



The primary and dual grid points are 



k=l k=l 

with boundary nodes x^ — x^ ~ 0. In the numerical experiments we observe that the grid 
is staggered, i.e. the primary nodes xf and the dual nodes x^ obey interlacing conditions 
(4.2) 

n - r {0) - r (0) < t (0) < r (0) < 9 (0) < r (0) < < r (0) < r (0) < < r<W < 1 

We do not prove (4.2) here, although it is possible to do so at least in some settings [5]. 

The optimal grids are closely connected with the sensitivity functions, which are given in 
our semi-discrete setting by the rows of the Jacobian T>Q. The studies in [3, 6, 17, 21] show 
that the sensitivity function corresponding to Kj is localized around the corresponding grid 

cell ($j , and its maximum is near x^. The same holds for Kj, after interchanging 
the primary and dual grid nodes. Moreover, the columns of the pseudoinverse {T>Qy have 
similar localization behavior. The Gauss-Newton update is the linear combination of the 
columns of (T>Qy, so optimal grids are appropriate for inversion. They localize well features 
of the resistivity that are recoverable from the measurements. 

4.2. Optimal grids for different matching conditions. We expect a good grid to have 
two properties. First, the grid nodes should not be clustered. If two nodes get too close 
then the corresponding rows of T>Q become almost linearly dependent, and the Jacobian is 
ill-conditioned. In practice the nodes that tend to get closer are x± — and x± — 
Kj°\ Second, the optimal grids should be refined near the measurement location to capture 
correctly the loss of resolution away from the boundary. But to have a good reconstruction 
throughout the domain, the nodes should not all be clustered at the boundary. 
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FIG. 4.1. Primary x\ ' and dual x\ ' optimal grid nodes for j = 1, . . . ,m(m = 5, 10) and different choices 
of matching conditions. Moment matching at s = 0.' primary X, dual o. Interpolation at geometrically distributed 
interpolation nodes: primary * and dual □ for slowly growing <r; primary * and dual v for fast growing cr. The 
number of fine grid steps in the semi-discretized model is N = 1999. 

Let us illustrate the grids for three choices of matching conditions (3.5). The first choice 
(simple Pade) corresponds to I = 1, a± = and M\ — m and yields the rational Krylov 
subspace 

/C m (0) = span {A~% A~ 2 b, . . . , A" m b} . 

The simple simple Pade approximant has the best accuracy near the expansion point (s = 
0), and obviously will be inferior for global approximation compared to properly chosen 
multipoint Pade aproximants. 

The second and third choices match G(s; r) and its first derivative at the nodes er = 
(&j)JLi distributed geometrically 

(4.3) aj = cti {^j i j = !,..., m, 

which yields the rational Krylov subspace 

fC m (a) = span {(5x7 - A^b, (a m I - A) _1 b} . 

We use a slower rate of growth for the second choice and a faster rate for the third choice. The 
slower rate corresponds to the optimal geometric progression [19] approximating Zolotarev 
nodes, which in turn give an optimal approximant based on the spectral interval of A(r). We 
expect the slower rate to give the best results between all three choices, which is confirmed 
in the numerical experiment below. 

We show the optimal grids for all three choices of matching conditions in Figure 4.1 for 
reduced models of sizes rn = 5, 10. We observe that the nodes of the grids corresponding 
to fast growing <x are clustered too close to the measurement point x = 0. Thus, inversion 
results based on matching conditions with fast growing interpolation nodes are expected to 
have poor resolution throughout the rest of the domain away from the origin. 

The clustering of grid nodes around the origin also leads to poor conditioning of the Ja- 
cobian T>Q. This is illustrated in Figure 4.2, where the condition numbers are plotted against 
the size m of the reduced model. We observe that the condition number of the Jacobian for 
the reduced model with fast growing <r increases exponentially. The condition numbers of 
the Jacobians for the other two choices of matching conditions grow very slowly. 
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FIG. 4.2. Dependence of the condition number ofDQ on the size m of the reduced model for different choices 
of matching conditions. Moment matching at s = is o, interpolation at geometrically distributed nodes: slowly 
growing a is □, fast growing <r is xj. The number of fine grid steps in the semi-discretized model is N = 1999. 



We can explain the poor performance in case of fast growing interpolation nodes if we 
view it as being close to the limiting case of having an approximation at infinity. Simple Pade 
approximant at infinity corresponds to Krylov subspace 

/C m (+oo) = span {b, Ah,..., A m - l h] 

which is unsuitable for inversion in our setting. To see this, recall from (2.2) that A(v) is 
tridiagonal and b is along ei. Thus, for any j € Z + only the first j + 1 components of 
the vector A^h are non-zero. When A(r) is projected on K m (+oo) in (3.4), the reduced 
model matrix A m is aware only of the upper left (to + 1) x (m + 1) block of A(r), which 
depends only on the first to + 1 entries of r. This is unsuitable for inversion where we want 
A m to capture the behavior of the resistivity in the whole interval, even for small to. The 
corresponding optimal grid steps will simply coincide with the first to grid steps h of the fine 
grid discretization in (2.2). 

If the interpolation nodes grow too quickly, we approach the limiting case of simple Pade 
at infinity, and thus for the first optimal grid steps we have 

(4.4) « 4 0) « h. 

' dn \ N ( 8k \ N 
This causes the rows ( — — I and ( — — ) of V Q to be almost collinear, and thus the 



dr k J k=l \dr k/k=1 
poor conditioning of the Jacobian observed in Figure 4.2. 

The geometrical distribution that we find to work particularly well in our setting is 



(4.5) (J, = oi ^1 + — J , j = 1, ... ,m, 

where we take a\ = 2 and C = 12. 

4.3. Osculatory rational interpolation problem. Solving the inverse resistivity prob- 
lem via (2.6) requires computing 

(4.6) ( % ,%)™ =1 = Q(d(-))- 
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This is the ill-conditioned step of the inversion method. It comes from the ill-posedness of 
the inverse problem and we mitigate it by limiting m. In practice we choose m so that the 
resulting Kj and Kj are positive. 

Unlike Q(y( ■ ; r)) which can be computed using the chain of mappings (3.13), the com- 
putation of Q(d( ■ )) requires solving an osculatory rational interpolation problem. This in- 
volves the solution of a linear system of equations with an ill-conditioned matrix, or taking a 
singular value decomposition of such a matrix. We use the condition number of the matrix to 
assess the instability of the problem. The condition number grows exponentially with m, but 
the rate of growth depends on the matching conditions used in the rational interpolation. In 
this section we study the two choices of matching conditions: (1) interpolation of G(s; r) and 
its first derivatives at distinct nodes <r distributed as in (4.5) (multipoint Pade approximation), 
and (2) matching of moments of G(s; r) at s = (simple Pade approximation). 

4.3.1. Multipoint Pade approximation. Let us write the reduced order transfer func- 
tion as a ratio 



(4.7) 



G m (s) 



Po +PiS + 

qo + qis 



Pm-lS 



Note that there is redundancy in the definition (4.7), since the coefficients of p and q may be 
multiplied by a common factor. This redundancy is used later to choose a unique solution of 
an underdetermined problem. 

We rewrite the matching conditions (3.5) of G(s) and G'(s) at the distinct interpolation 
nodes < o\ < oi < . . . < a m as 



. , m. 



(A 9,\ { P^o) - G m {<Jj)q{cr ) =0 

(4 - 8) \ p'(a j )-G' m (a j )q(^)-Gm(a j )q'(a j ) = ' J l > 

We can improve the conditioning of the interpolation by introducing the change of variables 



(4.9) 



6(0,1]. 



Let us define the Vandermonde-like m x (to + 1) matrices 



(4.10) S = 



1 cti a\ 
1 g 2 a\ 



1 a,. 



1 2oi 

1 2(7 2 

1 2a m 



and the diagonal matrices 
(4.11) 

F = diag (G(5 1 ),G(a 2 ), 



'i 

_m— 1 



,G'(a m )). 



,G{a m )), F' = diag(G'(5 1 ),G'(a 2 ), 
Equations (4.8) can be written in matrix- vector form as an underdetermined problem 
(4.12) R m w = 0, weM 2m+1 , 

where 



(4.13) 



J l:m. l:m 



^1: 



m. l;m 



-FT, 
-F'E - FT' 



i2mx(2ra+l) 
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and 



(4.14) 
(4.15) 



Py 



vjwj+ii i = 0,...,m-l, 



qj = a m 3 w j+m+1 , i = 0, 



In (4.13) we use Matlab notation to pick the first m rows and columns of £ and The 
problem is underdetermined because of the redundancy in (4.7). We eliminate it by the ad- 
ditional condition ||w||2 = 1, which makes it possible to solve (4.12) via the singular value 
decomposition 



(4.16) 



R„ 



USW T , w = W 1: 



(2m+l), 2m+l- 



Once the polynomials p and q are determined from (4.14)-(4.15), we can compute the 
partial fraction expansion (3.6). The poles —6j are the roots of q(s), and the residues Cj are 
given by 



q m EI i®k 

k=l 



3 = 1, • 



,m, 



assuming that 6j are distinct. Finally, Kj and Kj are obtained from 6j and Cj via a Lanczos 
iteration, as explained in section 5. 

4.3.2. Simple Pade approximation. When I = 1, u\ = and M x = m, we have a 
simple Pade approximant which matches the first 2m moments of y(t; r) in the time domain, 
because 



s=0 



(-iy 



+oo 



y(t;r)t 3 dt, j = 0, 1, . . . ,2m - 1. 



A robust algorithm for simple Pade approximation is proposed in [16]. It is also based on the 
singular value decomposition. If G(s) has the Taylor expansion at s = 



_ „2m-l , 

r 2m-l s + • • 



(4.17) G( S ) =T Q +T lS + T 2 S 2 + . 

then the algorithm in [16] performs a singular value decomposition of the Toeplitz matrix 



(4.18) 



T 

-*- rr 



7~i n 



Tl 
T2 



T~2m-1 T 2m-2 



T 

n 

T m -l 



cmx (m+1) 



If T m = USW T , then the coefficients of the denominator and numerator of (4.7) satisfy 
(4.19) 

t •■•00 
n t •••00 







f Po \ 




qi 




Pi 






= Wi:( m+ i) )m+ i, and 






\q m ) 




\Pm-l) 





Tm — 1 T m ^2 



T 





(qo\ 




qi 




\q m J 



We refer the reader to [16] for detailed explanations. 
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4.3.3. Comparison of simple and multipoint Pade approximants. To compare the 
performance of the oscillatory interpolation procedure in section 4.3.1 to that of the simple 
Pade approximation algorithm in section 4.3.2, we compare the condition numbers of the 
matrices R m and T m . They are computed for the reference resistivity r-(°), for several values 
of the size m of the reduced model. The results are in Table 4. 1 . We observe that while both 
condition numbers display exponential growth, the growth rate is slower for the multipoint 
Pade approximant. Thus, we conclude that it is the best of the choices of matching conditions 
considered in this section. It allows a more stable computation of Q(d( ■ )), it gives a good 
distribution of the optimal grid points and a well-conditioned Jacobian T>Q{y{ ■ ; r)). 

Table 4.1 

Condition numbers of Rm (multipoint Pade approximant) and T m (simple Pade approximant). The number of 
fine grid steps in the semi-discretized model is N = 299. 



m 


2 


3 


4 


5 


6 


cond(i? m ) 


4.43 • 10 2 


6.73 • 10 4 


1.85 • 10 Y 


6.95 • 10 9 


3.83 • 10 12 


cond(T m ) 


5.28 • 10 1 


1.26 • 10 5 


1.84 • 10 y 


9.14- 10 13 


2.86 • 10 ie 



5. Nonlinear preconditioner and its Jacobian. The computation of the nonlinear pre- 
conditioner Q(y( ■ ; r)) and its Jacobian is the most complex and time consuming computation 
in our inversion scheme. Nevertheless, it is much more efficient than the traditional inversion 
approach because it avoids the repeated computation of the time domain solution of the for- 
ward problem. In this section we explain the details of the computation of Q and T>Q via 
the chain of mappings (3.13). We do so only for the multipoint Pade approximant, which we 
showed above to be best suited for inversion. 

(a) The matrix A is defined by (2.2) for a given r. Differentiating (2.2) yields 

dA 

(5.1) — = -D T e k e T k D = d k d T k , k=l,...,N, 

with dfe = D\ vn . This is a rank one matrix. 

(b) At this step we need to differentiate the orthonormal basis V of the Krylov sub- 
space /C m (<x). There are different ways of computing an orthonormal basis of JC m (a). One 
choice is to use a rational Lanczos algorithm [15]. While this may be a more stable way of 
computing V compared to other approaches, differentiation formulas are difficult to derive 
and implement. We consider an alternative approach, based on the differentiation of the QR 
decomposition. 

We compute first the matrix 

(5.2) K = [{aj - A)- l h, (a m I - A^b] e R Nxm , 

and its derivatives 
(5.3) 

~ = - [(5=iJ - A^d^aJ - A)- l d k fb, (a m I - A^d^I - A)-M fc ] T b] 

Then V can be defined via the QR decomposition of K, which we write as 

(5.4) K = VR, 

where V G R Nxm , V T V = I m G R mx "\ and R € R mxm is upper triangular. If we denote 
L = R T , then (5.4) implies 

(5.5) K T K = LL T . 
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That is to say R is a Cholesky factor of K K. At the same time, when we differentiate 
(5.4), we obtain 

dV fdK dR\ , 
(5.6) =( V-—- )ir 1 , k = l,...,N. 



Since we already know dK/dr k from (5.3), in order to differentiate the mapping (b)-(c), we 
need to compute the derivative dR/dr k of the Cholesky factorization of K T K. This is given 
in the following proposition. 

Proposition 5.1 (Differentiation of Cholesky factorization). Let M e K mxm be 
a matrix with Cholesky factorization M = LL T . Given the perturbation SM of M, the 
corresponding perturbation SL of the Cholesky factor is computed by the following algorithm. 
For k = 1, . . . , m 

1 / SM kk fe -i \ 
oL kk = -p — I — ^ 2^ dLkjLkj ■ 



For i = k + 1, . . 



, m 
1 



SM ik — ^2 8L k jLij 
J =1 



fe-i \ 

Proof. The above formulas can be verified by direct computation if we write 

(5.7) SM = (5L)L T + L(8L) T 

and solve for the columns of SL, one a time, using that 5L is lower triangular. 
We use Proposition 5.1 for M = K T K, with perturbation 



SM =—Sr k , 
or k 



dM 
dr k 



to obtain 



SL = — 5r k and 

or k 



OK 
dr k 



OR 

dr k 



K + K 



dLX 



dK 
dr k 



The computation of dV/dr k follows from (5.6). 

(c) Once we have V and its derivatives we compute 



(5.8) 
(5.9) 



dA m 
dr k 

db m 
dr k 



-V T d k dlV 

dV T h 

dr k 



dV 



dV 



AV + V T A- . 
or k dr k 



(d)-(e) There are two possible ways to go from the reduced model A m , b m to the 
continued fraction coefficients Kj, Kj. Both approaches use a Lanczos iteration to obtain a 
symmetric tridiagonal matrix, which we denote by 



(5.10) 



T 



h a 2 
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For a symmetric matrix E we compute T and the columns Xj of a unitary matrix X via 
a Lanczos iteration with full reorthogonalization. Since the dimension m of the problem is 
small, we reorthogonalize at every step to ensure maximum numerical stability. The specific 
matrices E, X, and the initial vector 77 depend on which approach we consider. The iteration 
is as follows: 

Initialize xi = 77, xo = 0, fix = 0. 
For j = 1, . . . , to — 1 



W 



3+1 — (I ~ Xl:N,l:j(Xl:N, Uj) ) w j + l 



Pj+1 = ||w j+ i||, 

= o ■ 

Pj + 1 

Qi m — X m _Z?X m . 

The first approach is to compute the matrix T that is unitarily similar to the reduced 
model matrix A m , i.e. 

(5.11) T = Y T A m Y, Y T Y = I. 

Thus, in the Lanczos iteration above we take E — A m , X — Y and the initial vector 77 = 
b m /||b m ||. This allows us to combine steps (e) and (d) and thus go from the reduced model 
A m , b m directly to the tridiagonal matrix T. 

The second approach is to compute first the eigenvalue decomposition (3.7) of A m to get 
the poles — 9j and the residues Cj, for j = 1, . . . , to. Then take 

(5.12) T = QQQ T , Q T Q = QQ T = I, 

where 9 = — diag(#i, . . . ,8 m ). In the Lanczos iteration we have E = 8 and Xj are the 
transposed rows of Q, i.e. X = Q T . The entries of the initial vector are given by 



(5.13) Vl 



\| £ 

\ s=l 



From the computed T with either of the two approaches, we obtain the coefficients of 
the continued fraction using the formulas from [8]. 

(5-14) K! = — 1 , «4 



s=l 

1 

(5.15) Kj ; = — -j — ^ , J = 2,...,TO, 

K j-lPj-l K 3~l 

(5.16) Kj ■ = — 5 — , j = 2, ...,to. 



In order to differentiate the mappings (d) and (e), we need to differentiate the Lanczos 
iteration. In general, there is no explicit formula for the perturbations of the entries of T in 
terms of perturbations of E and r\. Thus, we would need to differentiate the Lanczos iteration 
directly, with an algorithm that computes the perturbations of ctj, j3j and x 3 iteratively, for 
increasing j. However, for the second approach we can use the explicit perturbation formulas 
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for the Lanczos iteration derived in [4]. To apply these formulas, we need to differentiate the 
steps (d) and (e) separately. 

The differentiation of (d) follows directly from the differentiation of the eigendecompo- 
sition (3.7) of A m . It is given by 



(5.17) 
(5.18) 
(5.19) 



dr k 
dz 
dr k 



dA m 
or k 



1 - (A m + e j I)^ — z,, 
c 

dr k 



dcj T 



z, + hi 



dzj 



where f denotes the pseudoinverse. 

For the computation of the derivatives in step (e) we use the explicit formulas from [4], 
for E = and X = Q T and r/ given by (5.13). Let us define the vectors 



( Sai \ 

Sai + 5ct2 



(5.20) 



\ 3=1 



) 



They can be expressed in terms of the perturbations of the poles = (9i, 
initial vector r/ as 



,0m) T and the 



(5.21) 
(5.22) 



S a = -Ag56 + ArjSf], 

Sp = -BeSO + B v Sr). 



Here Ag,A v £ R mxm are the matrices with entries 

m 1 

(5.23) A^=l + A^ — 



P =i p 



2Q ipQ i+l ,p ~. (,QipQi+l,j 



pQij) 



(5.24) AL>=2& 



Qi+l,jQij Am j _ „ A mj _ -. 



for i = 1, . . . , m — 1, j = 1, . . . , m and the entries of Bg,B v € 



D (m — 1) X m 



are 



(5.25) 



(5.26) 



B 8J 



5^ et - (9, 

P =i p j 

p¥=3 

(Qi+i.j) 2 



(Qi+l,p) 2 — ^^-Qi+l.pQi+l.j 



for i = 1, . . . , m — 1 and j = 1, . . . , m. 

Note that the computation of 5r) can be done by differentiating (5.13) and using (5.19). 
Once the vectors S a and 8p are known, it is trivial to obtain the individual perturbations 
6a j and 6(3j. Finally, from those perturbations we compute the derivatives of Kj and Kj 
by differentiating relations (5. 14)— (5. 16). The computation is straightforward and we do not 
include it here. 
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6. Inversion algorithm and numerical results. We solve the optimization problem 
(2.6) with a Gauss-Newton iteration. Recall that the Jacobian T>Q is a matrix in M. 2mxN . 
The Gauss-Newton iteration updates of the estimated resistivity are in the range of (T>Qy 
which has dimension 2m. This leads to low resolution in practice, because m is kept small 
to mitigate the sensitivity of the inverse problem to noise. If we have prior information about 
the unknown r' mc , we can use it to improve its estimate r*. The prior information is in the 
regularization term V[r). 

In our inversion algorithm we do the data fitting and the regularization separately. At 
each step of the Gauss-Newton iteration we compute first the update of r. Then we add to it 
corrections in the null space of T>Q, which ensures that they do not affect the residual. These 
corrections are chosen to minimize the regularization functional. To allow explicit compu- 
tation of the regularization correction, we use henceforth a weighted discrete H 1 seminorm 
regularization functional 

(6.1) V(r) = ±\\W 1 / 2 Ar\\l 
Here A is a matrix, the truncation of D, defined by 

and W G IR( Ar_1 ) x ( Ar_1 ) is a diagonal matrix of weights. We specify it later depending on 
the prior information on the true resistivity. 

Let r be the current Gauss-Newton iterate then the regularized iterate p is chosen so that 
the correction (r — p) lies in the null space of T>Q and V(p) is minimized. This can be 
achieved by solving the constrained optimization problem 

(6.2) minimize p T A T WAp. 

S.t. [Z>Q](r-p)=0 

Since (6.2) is quadratic with linear constraints, we get p from the first order optimality con- 
ditions given by the linear system 

(6.3) A T WA p + [VQ] T X = 0, 

(6.4) [DQ]p = [VQ]r, 

where A <E R 2m is a vector of Lagrange multipliers. 

In the numerical results presented in section 6.3 we consider smooth and piecewise con- 
stant resistivities, and choose W in (6. 1) accordingly. For smooth resistivities we simply take 
W = I, so that (6.1) is a regular discrete H 1 seminorm. For discontinuous resistivities we 
would like to minimize the total variation of the resistivity. This does not allow an explicit 
computation of p, so we make a compromise and use the weights introduced in [1]. The 
matrix W is diagonal with entries 

(6.5) Wj = (([Ar^f + ^r) 2 )- 1 , j = l,...,N-l, 
where <fi(r) is proportional to the misfit for the current iterate 

(6.6) 0(r) =C4Q(d(.))-Q(y(-;r)\\ 2 , 

and is some constant, which we set to l/(2m 2 ) in the numerical examples below. 

To ensure that A(r) corresponds to a discretization of an elliptic operator, we need posi- 
tive entries in r. This can be done with a logarithmic change of coordinates, which transforms 
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the optimization problem to an unconstrained one. However, in our numerical experiments 
we observed that if m is sufficiently small so that for the given data d( ■ ) all the entries of 
Q(d( ■ )) € R 2m are positive, then the Gauss-Newton updates of r remain positive as well. 
Thus, the logarithmic change of coordinates for r has not been needed in our computations. 

We use another logarithmic change of coordinates, to obtain a better scaling of the prob- 
lem and improve the quality of the reconstructions. Namely, we work with the logarithm of 
the continued fraction coefficients Kj and Kj. 

6.1. The inversion algorithm. The inputs of the inversion algorithm are the measured 
data d(t) and a guess value of m. This m reflects the expected resolution of the reconstruction, 
and may need to be decreased depending on the noise level. To compute the estimate r* of 
r true , perform the following steps: 

1. Define the interpolation nodes a via (4.5). Using the multipoint Pade scheme from 
section 4.3.1 compute (k*, Kj)jL 1 = Q(d( ■ )). 

2. If for some j either k* < or k* < 0, decrease m to m — 1 and return to step 1. 
If all (k* , k* )JLi ar e positive, fix m and continue to step 3. 

3. Define the vector of logarithms 1* = (log k\, . . . , log k^, log k*, . . . , log k^) t . 

4. Choose an initial guess r^ 1 ' € and the maximum number n GN of Gauss-Newton 
iterations. 

5. Forp = 1, . . . , n GN perform: 

5.1. For the current iterate r( p ) compute the nonlinear preconditioner 

and its Jacobian 

as explained in section 5. 

5.2. Define the vector of logarithms l^ p ) = (logK^, . . . , log , log , . . . ,logKm') T . 

5.3. Compute the step 

p (p) = - (vQ^y diag (k?\ . . . , «&>, «M . . . (1 (p) _ i*). 

5.4. Choose the step length £( p ) and compute the Gauss-Newton update 

r GN = r (p) + £(p) p (p)_ 

5.5. Compute the weight W using (6.5) with r GN or W — I. 

5.6. Solve for the next iterate r( p+1 ) from 



(DQ(p))r GN 

6. The estimate is 

r * = r ("Giv+ 1 ). 

Let us remark that since the nonlinear preconditioner is an approximation of the identity 
and most of the nonlinearity of the problem is resolved by the rational interpolation in step 
1, we may start with a poor initial guess in step 4 and still obtain good reconstructions. 
Moreover, the number n GN of Gauss-Newton iterations may be kept small. In the numerical 



(6.7) 



A T WA {VQ^y 
X>Q(p) 





r (p+l) 






X (p) 
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results presented in section 6.3 we take the number of iterations n GN = 5 for medium contrast 
resistivities and n GN — 10 for the high contrast case. The initial guess is the vector of all 
ones rW = 1 for all cases. 

In order to simplify our implementation we set the step length = 1 in step 5.4. 
However, choosing £W adaptively may be beneficial, especially for high contrast resistivities. 

While the Jacobian VQ^ is well-conditioned, the system (6.7) may not be. To alleviate 
this problem, instead of solving (6.7) directly, we may use a truncated singular value decom- 
position to obtain a regularized solution. Typically it is enough to discard just one component 
corresponding to the smallest singular value. 

6.2. Setup of the numerical experiments. We assess the performance of the inversion 
algorithm with numerical experiments. To avoid committing the inverse crime we use dif- 
ferent grids for generating the data and for the solution of the inverse problem. A finer grid 
with Nf — 299 uniformly spaced nodes is used to simulate the data, and a coarser grid with 
Nc = 199 uniformly spaced nodes is used in the inversion. 

The first term y(t; r' mc ) in (2.5) is approximated by solving the semi-discrete forward 
problem (2.1) with an explicit forward Euler time stepping on a finite time interval [0, Tmax], 
where Tmax = 100. The time step is hr = 10~ 5 , and the number of time steps is Nt = 
Tmax/hr = 10 7 . Note that even in the absence of noise there is a systematic error £( s ) (t) in 
the data that comes from the numerical approximation of the solution of (2.1). We denote by 
y the vector of length Nt, with entries given by the numerical approximation of y(t; r""°) at 
the time samples tj = jhr, 

y = (y 1 ,..., 2 / JVT ) T , y j =y(t j ;r™)+^(t j ), j = l,...,N T . 

We also define the vector £ M. Nt that simulates measurement noise. We use the 
multiplicative noise model 

(6.8) £ (n) =ediag( X i,...,Xjv T )y, 

where e is the noise level, and Xk are independent random variables distributed normally with 
zero mean and unit standard deviation. The data vector d £ M. Nt is 

(6.9) d = y + £< n) , 

and we denote its components by dj, for j = 1, . . . , Nt- 

The algorithm determines different sizes of the reduced model for different levels of 
noise. The larger e, the smaller m. The values of e and m used to obtain the results presented 
in section 6.3 are given in Table 6.1. 

Table 6.1 

Reduced model sizes m used for various noise levels e. 



e 


5 ■ 10- 2 


5 • 10~ 3 


lO" 4 


(noiseless) 


m 


3 


4 


5 


6 



The transfer function and its derivative at the interpolation points are approximated by 
taking the discrete Laplace transform of the simulated data 
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3.56e-02 



FIG. 6.1. Reconstructions ofr(x) (black solid line) after one (blue x) and five (red o) iterations. True 
coefficient by column: left ra, middle tl, right rj. Reduced model size from top row to bottom row m = 3,4,5,6. 
Noise levels are from Table 6. 1. The relative error E is printed at the bottom of the plots. 



(6.12) 



To quantify the error of the reconstructions r* we use the ratio of discrete £2 norms 

llr* - v lmc \U 



While such a measure of error is most appropriate for smooth resistivities, it may be 
overly strict for the reconstructions in the discontinuous case due to the large contribution of 
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discontinuities. However, even under such an unfavorable measure the inversion procedure 
demonstrates good performance. 

6.3. Numerical results. We show first the estimates of three resistivity functions of 
contrast two. We consider two smooth resistivities 

(6.13) r«*{x) = r Q (x) 

(6.14) r^(x) = r L {x) 
and the piecewise constant 

( 1, for x < 0.2 

(6.15) r*°°(x) = rj{x) := < 2, for 0.2 < x < 0.6 

[ 1.5, for x> 0.6 

The results are displayed in Figure 6.1, for various reduced model sizes and levels of noise, 
as listed in Table 6.1. Each reconstruction uses its own realization of noise. We use five 
Gauss-Newton iterations n GN = 5, and we display the solution both after one iteration and 
after all five. The initial guess is r^'(x) = 1. It is far from the true resistivity r' me (x), 
and yet the inversion procedure converges quickly. The features of ^""(x) are captured well 
after the first iteration, but there are some spurious oscillations, corresponding to the peaks 
of the sensitivity functions. A few more regularized Gauss-Newton iterations remove these 
oscillations and improve the quality of the estimate of the resistivity. The relative error (6. 12) 
is indicated in each plot in Figure 6.1. It is small, of a few percent in all cases. 

We also observe in Figure 6. 1 that the inversion method regularized with the nonlinear 
weight (6.5) performs well for the piecewise constant resistivity rj. Without the regular- 
ization, the estimates have Gibbs-like oscillations near the discontinuities of r"" c (x). These 
oscillations are suppressed by the weighted discrete H 1 regularization. 

In Figure 6.2 we are comparing our inversion procedure to an inversion approach that 
fits the poles and residues (0j,Cj)JL 1 instead of the continued fraction coefficients. In such 
an approach we use Q pr from (3.9) instead of Q. The comparison is done for the case of 
piecewise constant resistivity of higher contrast 

( 1, for x < 0.2 

(6.16) r ,r "(.T) = r H (x) := I 5, for 0.2 < x < 0.6 

[ 3, for x > 0.6 

As expected, the inversion procedure obtains the reconstructions of superior quality when 
Q is used. In fact, the algorithm diverges for m — 4 and m — 6 when using Q pr . Thus, for 
in = 4, 6 we plot the first and second iterates only. On the other hand, when using Q the 
inversion procedure converges in all three cases and maintains the relative error well below 
10% for m = 4, 5 and around 11% for m = 6. The reconstruction plots in Figure 6.2 
are complimented with the plots of the relative error £ versus the Gauss-Newton iteration 
number p. We observe from the behavior of £ that even in the case when the iteration with 
Q r converges (m = 5) the resulting reconstruction for Q has a much better quality (18.2% 
forQ pr versus 8.5% for Q). 

7. Summary. We introduced a numerical inversion algorithm for linear parabolic partial 
differential equations. The setup is for a one dimensional problem. It arises in the application 
of controlled source electromagnetic inversion, where the unknown is the subsurface electri- 
cal resistivity r{x) in the earth, assumed layered along the coordinate x. The data d(t) are 



:=2-4{x- 



_ 8e -100(*-0.2) 2 +x + 1 
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FIG. 6.2. Comparison between two preconditioned for high contrast piecewise constant resistivity ru(x) 
(black solid line) after one (blue X and *) and tlgn (red o and □ ) iterations. Top row: reconstructions using Q 
( n GN — 10)- Middle row: reconstructions using Q pr (n^jv = 2 form = 4, 6 and tiqn = 10 for m = 5). 
Bottom row: relative error £ versus the iteration number pfor reconstructions with Q (solid line with o) and Q pr 
(dashed line with □ ). The relative error £ is printed at the bottom of the reconstruction plots. 



time resolved measurements of the magnetic field along a direction orthogonal to x, at the 
surface x = 0. The domain of the solution is a finite interval normalized to length one, but 
essentially the same ideas apply to the half line x E (0, oo). Extensions of the algorithm to 
higher dimensions are possible. We will present them in a separate publication. 

To motivate the inversion algorithm we place the inverse problem in a model reduction 
framework. We semidiscretize in x the parabolic partial differential equation on a grid with 
N 3> 1 points, and obtain a dynamical system with transfer function G(s; r), the Laplace 
transform of the time measurements. The reduced model is a dynamical system of much 
smaller size m <C N, with transfer function G m (s) m G(s; r). Because G m (s) is a rational 
function of s, we solve a rational approximation problem. We study various such approxi- 
mants to determine which are best suited for inversion. We end up with a multipoint Pade 
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approximant G m (s), which interpolates G and its first derivatives at nodes distributed geo- 
metrically in K+. 

The inversion algorithm is a Gauss-Newton iteration for an optimization problem pre- 
conditioned with a nonlinear mapping Q. This mapping is the essential ingredient in the 
inversion. It takes functions in the space S where the unknown resistivity lies to a vector of 
2m discrete resistivities. They are the 2m coefficients in the continued fraction representation 
of the rational function G m (s). Because G m (s) is the transfer function of a dynamical sys- 
tem corresponding to a finite difference discretization of our parabolic equation, the discrete 
resistivities are also the entries in the difference operator. 

Most inversion algorithms estimate the resistivity by minimizing over resistivities r E S 
the least squares misfit of the data d(t) and the mathematical model y(t; r) of the measure- 
ments. Because our mapping Q(y(-;r)) is an approximate identity when restricted to a subset 
of sufficiently regular resistivities r, we use it as a preconditioner in our inversion approach, 
and minimize the least squares misfit of Q(d(t)) and Q(y(t;r)). The advantage is the in- 
creased stability of the inversion and very fast convergence of the iteration. 

The nonlinear preconditioner Q(y( ■ ; r)) is defined via an explicit chain of mappings. 
Each step in the chain involves a numerically stable computation. The computation of the 
Jacobian VQ follows by the chain rule and we describe it explicitly, step by step. The only 
unstable computation in the inversion is the calculation of Q(d(t j). The instability is inherited 
from that of the inverse problem and is unavoidable. We mitigate it by restricting the size m 
of the reduced model adaptively, depending on the noise level. The smaller m is, the lower 
the resolution of the estimated resistivity. This is because at each iteration the resistivity 
updates are in the range of (VQ)\ of low dimension 2m <C N. We improve the results by 
adding corrections in the null space of VQ, so that we minimize a regularization functional 
that incorporates prior information about the unknown resistivity. The performance of the 
algorithm is assessed with numerical simulations. 
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